Mathematical model
A mathematical model for automatic differentiation in machine learning
Automatic differentiation, as implemented today, does not have a simple mathematical model adapted to the needs of modern machine learning. In this work we articulate the relationship between differentiation of programs as implemented in practice and differentiation of nonsmooth functions. To this end we provide a simple class of functions and a nonsmooth calculus, and show how they apply to stochastic approximation methods. We also highlight the issue of artificial critical points created by algorithmic differentiation and show how the usual methods avoid these points with probability one.
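The artificial critical points mentioned above can be seen in a minimal, self-contained sketch (illustrative code, not the paper's formalism): the function f(x) = relu(x) - relu(-x) is exactly the identity, yet under the usual AD convention relu'(0) = 0, algorithmic differentiation reports a zero gradient at x = 0.

```python
# Artificial critical points from algorithmic differentiation: a sketch.
# f(x) = relu(x) - relu(-x) equals x everywhere, but AD with the
# convention relu'(0) = 0 reports slope 0 at the kink.

def relu(x):
    return x if x > 0 else 0.0

def d_relu(x):
    # Standard AD convention: derivative taken as 0 at the kink x = 0.
    return 1.0 if x > 0 else 0.0

def f(x):
    return relu(x) - relu(-x)  # identical to the identity function

def ad_grad_f(x):
    # Chain rule as an AD tool would apply it.
    return d_relu(x) - d_relu(-x) * (-1.0)

print(ad_grad_f(1.0))  # 1.0: correct away from the kink
print(ad_grad_f(0.0))  # 0.0: an artificial critical point (true slope is 1)
```

Away from zero the computed gradient is the correct value 1; only the point x = 0, where the chain rule composes two zero subgradients, becomes an artificial critical point.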
Quantum computers turned out to be more useful than expected in 2025
For the past year, I kept bringing the same story to my editor: quantum computers are on the edge of becoming useful for scientific discovery. Of course, that has always been the goal. The idea of using quantum computers to better understand our universe is part of their origin story, and it even featured in a 1981 speech by Richard Feynman. Contemplating the best way to simulate nature, he said: "We can give up on our rule about what the computer was, we can say: Let the computer itself be built of quantum mechanical elements which obey quantum mechanical laws." Today, Feynman's vision has been realised by Google, IBM and dozens more companies and academic teams. Their devices are now being used to simulate reality at the quantum level - and here are some highlights.
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
- North America > United States > Maryland (0.05)
- North America > Canada > Ontario > Waterloo Region > Waterloo (0.05)
- Information Technology > Hardware (1.00)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence (1.00)
AlphaOPT: Formulating Optimization Programs with Self-Improving LLM Experience Library
Kong, Minwei, Qu, Ao, Guo, Xiaotong, Ouyang, Wenbin, Jiang, Chonghe, Zheng, Han, Ma, Yining, Zhuang, Dingyi, Tang, Yuhan, Li, Junyi, Wang, Shenhao, Koutsopoulos, Haris, Wang, Hai, Wu, Cathy, Zhao, Jinhua
Optimization modeling enables critical decisions across industries but remains difficult to automate: informal language must be mapped to precise mathematical formulations and executable solver code. Prior LLM approaches either rely on brittle prompting or require costly retraining with limited generalization. We present AlphaOPT, a self-improving experience library that enables an LLM to learn from limited demonstrations (even answers alone, without gold-standard programs) and solver feedback - without annotated reasoning traces or parameter updates. AlphaOPT operates in a continual two-phase cycle: (i) a Library Learning phase that reflects on failed attempts, extracting solver-verified, structured insights as {taxonomy, condition, explanation, example}; and (ii) a Library Evolution phase that diagnoses retrieval misalignments and refines the applicability conditions of stored insights, improving transfer across tasks. This design (1) learns efficiently from limited demonstrations without curated rationales, (2) expands continually without costly retraining by updating the library rather than model weights, and (3) makes knowledge explicit and interpretable for human inspection and intervention. Experiments show that AlphaOPT steadily improves with more data (65% to 72% from 100 to 300 training items) and surpasses the strongest baseline by 7.7% on the out-of-distribution OptiBench dataset when trained only on answers. Code and data are available at: https://github.com/Minw913/AlphaOPT.
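The {taxonomy, condition, explanation, example} insight record and the two-phase learn/refine cycle can be sketched as follows. This is a hypothetical illustration: the names `Insight`, `ExperienceLibrary`, `learn`, `refine`, and `retrieve`, and the keyword-matching retrieval, are assumptions for exposition, not AlphaOPT's actual API.

```python
# Hypothetical sketch of a self-improving experience library with
# solver-verified insights and condition-based retrieval.
from dataclasses import dataclass

@dataclass
class Insight:
    taxonomy: str     # e.g. "constraints/logical"
    condition: str    # keyword-style applicability condition
    explanation: str  # lesson extracted from a failed attempt
    example: str      # short modeling snippet illustrating the fix

class ExperienceLibrary:
    def __init__(self):
        self.insights = []

    def learn(self, insight):                  # Library Learning phase
        self.insights.append(insight)

    def refine(self, insight, new_condition):  # Library Evolution phase
        insight.condition = new_condition      # fix a retrieval misalignment

    def retrieve(self, task_description):
        # Naive retrieval: insights whose condition matches the task text.
        return [i for i in self.insights
                if i.condition.lower() in task_description.lower()]

lib = ExperienceLibrary()
ins = Insight("constraints/logical", "either-or",
              "Model either-or constraints with a binary variable and big-M.",
              "x <= M*y;  z <= M*(1-y)")
lib.learn(ins)
hits = lib.retrieve("Plant runs either shift A or shift B (either-or choice).")
print(len(hits))  # 1
```

In the real system, retrieval and condition refinement would be driven by the LLM and solver feedback rather than substring matching, but the data flow is the same: failed attempts produce insights, and misretrieved insights get their conditions updated in place.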
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
- Asia > Singapore > Central Region > Singapore (0.04)
- Asia > Middle East > Jordan (0.04)
Adaptive tumor growth forecasting via neural & universal ODEs
Subramanian, Kavya, Joshi, Prathamesh Dinesh, Dandekar, Raj Abhijit, Dandekar, Rajat, Panat, Sreedath
Forecasting tumor growth is critical for optimizing treatment. Classical growth models such as the Gompertz and Bertalanffy equations capture general tumor dynamics but may fail to adapt to patient-specific variability, particularly with limited data available. In this study, we leverage Neural Ordinary Differential Equations (Neural ODEs) and Universal Differential Equations (UDEs), two pillars of Scientific Machine Learning (SciML), to construct adaptive tumor growth models capable of learning from experimental data. Using the Gompertz model as a baseline, we replace rigid terms with adaptive neural networks to capture hidden dynamics through robust modeling in the Julia programming language. We use our models to perform forecasting under data constraints and symbolic recovery to transform the learned dynamics into explicit mathematical expressions. Our approach has the potential to improve predictive accuracy, guiding dynamic and effective treatment strategies for improved clinical outcomes.
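The Gompertz baseline and the UDE idea can be sketched in a few lines. The paper works in Julia; the sketch below is a Python translation of the same structure, with illustrative parameter values (a = 0.1, K = 100) that are not taken from the paper.

```python
# Forward-Euler sketch of the Gompertz model dV/dt = a * V * ln(K / V),
# and the Universal ODE idea of replacing the rigid growth term with a
# learnable function g (in practice a small neural network).
import math

def gompertz_rhs(V, a=0.1, K=100.0):
    return a * V * math.log(K / V)

def ude_rhs(V, g):
    # Keep the V-proportional structure; let g stand in for ln(K/V).
    return V * g(V)

def simulate(rhs, V0, dt=0.01, steps=10_000):
    V = V0
    for _ in range(steps):
        V += dt * rhs(V)  # explicit Euler step
    return V

# The Gompertz solution saturates near the carrying capacity K.
V_final = simulate(gompertz_rhs, V0=1.0)
# A UDE with g chosen to match ln(K/V) reproduces the same dynamics;
# training would instead fit g to patient-specific data.
V_ude = simulate(lambda V: ude_rhs(V, g=lambda v: 0.1 * math.log(100.0 / v)),
                 V0=1.0)
print(round(V_final, 1))
```

The point of the UDE formulation is exactly this substitution: the known multiplicative structure is kept, while the hand-written ln(K/V) term is replaced by a trainable component that can absorb patient-specific deviations.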
- Research Report > New Finding (0.48)
- Research Report > Experimental Study (0.48)
CALM Before the STORM: Unlocking Native Reasoning for Optimization Modeling
Tang, Zhengyang, Ye, Zihan, Huang, Chenyu, Huang, Xuhan, Li, Chengpeng, Li, Sihang, Chen, Guanhua, Yan, Ming, Wang, Zizhuo, Zha, Hongyuan, Liu, Dayiheng, Wang, Benyou
Large Reasoning Models (LRMs) have demonstrated strong capabilities in complex multi-step reasoning, opening new opportunities for automating optimization modeling. However, existing domain adaptation methods, originally designed for earlier instruction-tuned models, often fail to exploit the advanced reasoning patterns of modern LRMs -- in particular, we show that direct fine-tuning on traditional non-reflective datasets leads to limited gains. To fully leverage LRMs' inherent reasoning abilities, we propose CALM (Corrective Adaptation with Lightweight Modification), a framework that progressively refines LRMs within their native reasoning modes for optimization modeling tasks. In CALM, an expert intervener identifies reasoning flaws and provides concise corrective hints, which the LRM incorporates to produce improved reasoning trajectories. These interventions modify fewer than 2.6% of generated tokens, but generate high-quality data for soft adaptation through supervised fine-tuning. The adapted model is then further improved through reinforcement learning. Building on CALM, we develop STORM (Smart Thinking Optimization Reasoning Model), a 4B-parameter LRM that achieves a new state-of-the-art average accuracy of 68.9% across five popular optimization modeling benchmarks, matching the performance of a 671B LRM. These results demonstrate that dynamic, hint-based data synthesis both preserves and amplifies the native reasoning patterns of modern LRMs, offering a more effective and scalable path towards expert-level performance on challenging optimization modeling tasks.
Algorithms and data structures for automatic precision estimation of neural networks
We describe algorithms and data structures that extend a neural network library with automatic precision estimation for floating point computations. We also discuss the conditions under which the estimates are exact while preserving the high computational performance of neural network training and inference. Numerical experiments show the consequences of significant precision loss for particular values, such as inference outputs and gradients, and reveal deviations from mathematically predicted behavior. It turns out that almost any neural network accumulates computational inaccuracies; as a result, its behavior does not coincide with that predicted by the mathematical model of the network. This shows that tracking computational inaccuracies is important for reliable inference and training and for the interpretability of results.
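A minimal illustration of the underlying phenomenon (not the paper's data structures): naive floating point accumulation drifts away from the mathematically exact result, and the drift can be exposed by comparing against error-free summation.

```python
# Accumulated rounding error in naive summation vs math.fsum,
# which computes a correctly rounded sum of the float inputs.
import math

xs = [0.1] * 1_000_000      # exact mathematical sum is 100000.0

naive = 0.0
for x in xs:
    naive += x              # each += rounds; errors accumulate

exact = math.fsum(xs)       # correctly rounded summation
drift = abs(naive - exact)
print(drift > 0)            # True: behavior deviates from the math model
```

Each individual addition is accurate to machine precision, yet a million of them produce a measurable deviation; the same mechanism, compounded across the many accumulations in a forward and backward pass, is what precision tracking in a neural network library has to account for.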
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > New York > New York County > New York City (0.04)
An N-Plus-1 GPT Agency for Critical Solution of Mechanical Engineering Analysis Problems
Patera, Anthony, Abeyaratne, Rohan
Generative AI, and specifically GPT, can produce a remarkable solution to a mechanical engineering analysis problem - but also, on occasion, a flawed solution. For example, an elementary mechanics problem is solved flawlessly in one GPT instance and incorrectly in a subsequent GPT instance, with a success probability of only 85%. This unreliability renders "out-of-the-box" GPT unsuitable for deployment in education or engineering practice. We introduce an "N-Plus-1" GPT Agency for Initial (Low-Cost) Analysis of mechanical engineering Problem Statements. Agency first launches N instantiations of Agent Solve to yield N independent Proposed Problem Solution Realizations; Agency then invokes Agent Compare to summarize and compare the N Proposed Problem Solution Realizations and to provide a Recommended Problem Solution. We argue from Condorcet's Jury Theorem that, for a Problem Statement characterized by per-Solve success probability greater than 1/2 (and N sufficiently large), the Predominant (Agent Compare) Proposed Problem Solution will, with high probability, correspond to a Correct Proposed Problem Solution. Furthermore, Agent Compare can also incorporate aspects of Secondary (Agent Compare) Proposed Problem Solutions, in particular when the latter represent alternative Problem Statement interpretations - different Mathematical Models - or alternative Mathematical Solution Procedures. Comparisons to Grok Heavy, a commercial multi-agent model, show similarities in design and performance, but also important differences in emphasis: our Agency focuses on transparency and pedagogical value.
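The Condorcet argument above is easy to check numerically. The sketch below (illustrative code, not the Agency's implementation) computes the probability that a majority of N independent Solves is correct when each succeeds with probability p; odd N avoids ties.

```python
# Condorcet-style majority success probability for N independent solves,
# each correct with probability p: P(more than N/2 successes), odd N.
from math import comb

def majority_correct(p, n):
    # Sum the binomial tail above the majority threshold (n // 2 + 1).
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

p = 0.85  # per-Solve success probability from the abstract
for n in (1, 3, 5, 9):
    print(n, round(majority_correct(p, n), 4))
```

With p = 0.85 the majority is already more reliable than any single Solve at N = 3, and reliability keeps increasing with N, which is exactly the jury-theorem behavior the Agency relies on for p > 1/2.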
Automated Optimization Modeling through Expert-Guided Large Language Model Reasoning
Yang, Beinuo, Zhou, Qishen, Li, Junyi, Su, Chenxing, Hu, Simon
Optimization Modeling (OM) is essential for solving complex decision-making problems. However, the process remains time-consuming and error-prone, heavily relying on domain experts. While Large Language Models (LLMs) show promise in addressing these challenges through their natural language understanding and reasoning capabilities, current approaches face three critical limitations: high benchmark labeling error rates reaching up to 42%, narrow evaluation scope that only considers optimal values, and computational inefficiency due to heavy reliance on multi-agent systems or model fine-tuning. In this work, we first enhance existing datasets through systematic error correction and more comprehensive annotation. Additionally, we introduce LogiOR, a new optimization modeling benchmark from the logistics domain, containing more complex problems with standardized annotations. Furthermore, we present ORThought, a novel framework that leverages expert-level optimization modeling principles through chain-of-thought reasoning to automate the OM process. Through extensive empirical evaluation, we demonstrate that ORThought outperforms existing approaches, including multi-agent frameworks, with particularly significant advantages on complex optimization problems. Finally, we provide a systematic analysis of our method, identifying critical success factors and failure modes and offering valuable insights for future research on LLM-based optimization modeling.
- North America > United States > New Jersey > Hudson County > Hoboken (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Asia > Singapore > Central Region > Singapore (0.04)
- Asia > China > Hong Kong (0.04)
Large Language Models for Supply Chain Decisions
Simchi-Levi, David, Mellou, Konstantina, Menache, Ishai, Pathuri, Jeevan
Supply Chain Management requires addressing a variety of complex decision-making challenges, from sourcing strategies to planning and execution. Over the last few decades, advances in computation and information technologies have enabled the transition from manual, intuition- and experience-based decision-making to more automated and data-driven decisions using a variety of tools that apply optimization techniques. These techniques use mathematical methods to improve decision-making. Unfortunately, business planners and executives still need to spend considerable time and effort to (i) understand and explain the recommendations coming out of these technologies; (ii) analyze various scenarios and answer what-if questions; and (iii) update the mathematical models used in these tools to reflect current business environments. Addressing these challenges requires involving data science teams and/or the technology providers to explain results or make the necessary changes in the technology, which significantly slows down decision-making. Motivated by recent advances in Large Language Models (LLMs), we report how this disruptive technology can democratize supply chain technology - namely, facilitate the understanding of tools' outcomes and enable direct interaction with supply chain tools without a human in the loop. Specifically, we report how we apply LLMs to address the three challenges described above, substantially reducing the time to decision from days and weeks to minutes and hours and dramatically increasing planners' and executives' productivity and impact.
Counterfactual optimization for fault prevention in complex wind energy systems
Carrizosa, Emilio, Fischetti, Martina, Haaker, Roshell, Morales, Juan Miguel
Machine Learning models are increasingly used in businesses to detect faults and anomalies in complex systems. In this work, we take this approach a step further: beyond merely detecting anomalies, we aim to identify the optimal control strategy that restores the system to a safe state with minimal disruption. We frame this challenge as a counterfactual problem: given a Machine Learning model that classifies system states as either "good" or "anomalous," our goal is to determine the minimal adjustment to the system's control variables (i.e., its current status) that is necessary to return it to the "good" state. To achieve this, we leverage a mathematical model that finds the optimal counterfactual solution while respecting system-specific constraints. Notably, most counterfactual analysis in the literature focuses on individual cases where a person seeks to alter their status relative to a decision made by a classifier--such as for loan approval or medical diagnosis. Our work addresses a fundamentally different challenge: optimizing counterfactuals for a complex energy system, specifically an offshore wind turbine oil-type transformer. This application not only advances counterfactual optimization in a new domain but also opens avenues for broader research in this area. Our tests on real-world data provided by our industrial partner show that our methodology easily adapts to user preferences and brings savings on the order of 3 million euros per year in a typical farm.
Introduction
Energy systems are becoming increasingly complex, making it more challenging--and more critical--to detect faults early and develop strategies to mitigate them. In this context, Machine Learning (ML) techniques have become an industry standard for early fault detection [16]. Energy companies can monitor various sensor readings from the turbines and apply ML methods to identify potential issues with components.
In this paper, we define a fault (or faulty state) as a condition where a component is in an unsafe status, while an anomaly refers to any irregularity that is not necessarily dangerous. Note that faults are a subset of anomalies. When a fault is detected, a controller is immediately activated to prevent severe damage to the turbine. Machine Learning models can detect anomalies in advance, providing companies with a window of time to intervene before faults occur.
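The minimal-adjustment counterfactual has a closed form in the simplest setting. The sketch below assumes a linear classifier score(x) = w·x + b with "good" meaning a non-negative score, and omits the system-specific constraints the paper's model enforces; it is an illustration of the counterfactual idea, not the authors' method.

```python
# Least-squares counterfactual for a linear "good"/"anomalous" classifier:
# project the anomalous state onto the decision boundary and push it
# slightly past it by a small margin.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def counterfactual(x, w, b, margin=1e-6):
    score = dot(w, x) + b
    if score >= 0:
        return list(x)  # already in the "good" state; no adjustment needed
    # Move along w just far enough to reach score = +margin; this is the
    # minimal Euclidean adjustment for a linear boundary.
    step = (margin - score) / dot(w, w)
    return [xi + step * wi for xi, wi in zip(x, w)]

w, b = [2.0, -1.0], -1.0   # illustrative classifier, not from the paper
x = [0.0, 1.0]             # anomalous state: score = -2
x_cf = counterfactual(x, w, b)
print(dot(w, x_cf) + b >= 0)  # True: restored to the "good" state
```

With a nonlinear classifier or operational constraints on the control variables, this projection becomes a constrained optimization problem, which is where the mathematical programming model described in the abstract comes in.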
- Europe > Spain > Andalusia > Seville Province > Seville (0.04)
- Europe > Spain > Andalusia > Málaga Province > Málaga (0.04)
- Europe > Northern Europe (0.04)
- (2 more...)
- Research Report (0.82)
- Overview (0.67)